41 research outputs found
Memory and Parallelism Analysis Using a Platform-Independent Approach
Emerging computing architectures such as near-memory computing (NMC) promise improved performance for applications by reducing the data movement between the CPU and memory. However, detecting such applications is not a trivial task. In this ongoing work, we extend the state-of-the-art platform-independent software analysis tool with NMC-related metrics such as memory entropy, spatial locality, data-level parallelism, and basic-block-level parallelism. These metrics help identify the applications that are most suitable for NMC architectures.
Comment: 22nd ACM International Workshop on Software and Compilers for Embedded Systems (SCOPES '19), May 2019
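To make the first two metrics concrete, the sketch below computes memory entropy and a simple spatial-locality score from a raw memory-address trace. The trace format, the 64-byte line granularity, and the helper names are assumptions for illustration only, not the tool's actual implementation.

```python
# Illustrative sketch (assumed trace format): two platform-independent memory metrics.
import math
from collections import Counter

def memory_entropy(addresses, line_bytes=64):
    """Shannon entropy (bits) of the accessed cache-line distribution."""
    lines = [a // line_bytes for a in addresses]
    counts = Counter(lines)
    total = len(lines)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def spatial_locality(addresses, line_bytes=64):
    """Fraction of consecutive accesses that land within one cache line
    of the previous access (a simple stride-based locality proxy)."""
    close = sum(1 for prev, cur in zip(addresses, addresses[1:])
                if abs(cur - prev) < line_bytes)
    return close / max(len(addresses) - 1, 1)

trace = [0x1000, 0x1008, 0x1010, 0x9000, 0x9008, 0x1018]  # toy address trace
print(f"entropy = {memory_entropy(trace):.2f} bits, "
      f"spatial locality = {spatial_locality(trace):.2f}")
```

Low entropy and high spatial locality suggest conventional caches already serve the application well; the opposite pattern points toward NMC suitability.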
PET-to-MLIR: A polyhedral front-end for MLIR
We present PET-to-MLIR, a new tool to enter the MLIR compiler framework from C source. The tool is based on the popular PET and ISL libraries for extracting and manipulating quasi-affine sets and relations, and on Loop Tactics, a declarative optimizer. The use of PET brings advanced diagnostics and full support for C by relying on the Clang parser. ISL allows easy manipulation of the polyhedral representation and efficient code generation. Loop Tactics, on the other hand, enables us to detect computational motifs transparently and lift the entry point into MLIR, thus enabling domain-specific optimizations in general-purpose code. We demonstrate our tool on the Polybench/C benchmark suite and show that it successfully lowers most of the benchmarks to MLIR’s affine dialect. We believe that our tool can benefit research in the compiler community by providing an automatic way to translate C code to the MLIR affine dialect.
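As a conceptual illustration of the quasi-affine property that PET extracts and ISL manipulates, the hypothetical sketch below checks whether an array subscript is affine in the loop induction variables and parameters. This is not how PET or ISL work internally; sympy is used purely for the illustration.

```python
# Hypothetical sketch: is an array subscript affine in the loop variables/parameters?
import sympy

i, j, N = sympy.symbols("i j N", integer=True)

def is_affine(expr, vars_and_params):
    """True if expr has degree <= 1 in the loop variables and parameters."""
    try:
        poly = sympy.Poly(expr, *vars_and_params)
    except Exception:
        return False          # non-polynomial terms cannot be plain affine expressions
    return poly.total_degree() <= 1

print(is_affine(2*i + 3*j + 1, (i, j, N)))    # True:  A[2*i + 3*j + 1]
print(is_affine(i + N, (i, j, N)))            # True:  parameters may appear linearly
print(is_affine(i * j, (i, j, N)))            # False: A[i*j] is not affine
print(is_affine(sympy.Mod(i, 4), (i, j, N)))  # False here; modulo is only quasi-affine
```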
NMPO: Near-Memory Computing Profiling and Offloading
Real-world applications now process big data sets and are often bottlenecked by the data movement between the compute units and the main memory. Near-memory computing (NMC), a modern data-centric computational paradigm, can alleviate these bottlenecks, thereby improving the performance of applications. The lack of available NMC systems makes simulators the primary evaluation tool for performance estimation. However, simulators are usually time-consuming, and methods that reduce this overhead would accelerate the early-stage design process of NMC systems. This work proposes Near-Memory computing Profiling and Offloading (NMPO), a high-level framework capable of predicting NMC offloading suitability using an ensemble machine-learning model. NMPO predicts NMC suitability with an accuracy of 85.6% and, compared to prior works, reduces the prediction time by up to three orders of magnitude by using hardware-dependent application features.
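The sketch below shows the general shape of such an ensemble predictor, assuming a table of per-kernel features with a binary offload label. The feature names, the random-forest choice, and the synthetic data are assumptions for illustration, not NMPO's actual feature set or model.

```python
# Minimal sketch of an ensemble offloading predictor (synthetic data, assumed features).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((200, 3))                 # columns: [llc_miss_rate, bw_util, ipc]
y = (X[:, 0] > 0.6).astype(int)          # toy labelling rule: miss-heavy => offload

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print(f"held-out accuracy: {model.score(X_te, y_te):.2f}")
```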
Near Memory Acceleration on High Resolution Radio Astronomy Imaging
Modern radio telescopes like the Square Kilometre Array (SKA) will need to process exabytes of radio-astronomical signals in real time to construct a high-resolution map of the sky. Near-Memory Computing (NMC) could alleviate the performance bottlenecks caused by frequent memory accesses in a state-of-the-art radio-astronomy imaging algorithm. In this paper, we show that a sub-module performing a two-dimensional fast Fourier transform (2D FFT) is memory bound, using CPI breakdown analysis on an IBM Power9. We then present an NMC approach on FPGA for the 2D FFT that outperforms a CPU by up to 120x and performs comparably to a high-end GPU, while using less bandwidth and memory.
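A back-of-the-envelope roofline view of why a large 2D FFT tends to be memory bound is sketched below. The flop and traffic estimates and the machine peak numbers are illustrative placeholders, not measured Power9 figures from the paper.

```python
# Roofline-style sketch: arithmetic intensity of a 2D FFT vs. machine balance.
import math

def fft2d_intensity(n, bytes_per_sample=16):       # complex double = 16 B
    flops = 5.0 * n * n * math.log2(n * n)          # standard 5*N*log2(N) FFT estimate
    bytes_moved = 4 * n * n * bytes_per_sample      # grid read + written in row and column passes
    return flops / bytes_moved

peak_gflops, peak_gbs = 500.0, 120.0                # placeholder CPU peak compute / bandwidth
n = 8192                                            # one 8192 x 8192 grid
print(f"arithmetic intensity = {fft2d_intensity(n):.1f} flop/byte, "
      f"machine balance = {peak_gflops / peak_gbs:.1f} flop/byte")
# Intensity below the machine balance => the kernel is bandwidth (memory) bound,
# which is what motivates computing the FFT near memory.
```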
TDO-CIM: Transparent Detection and Offloading for Computation In-memory
Computation in-memory is a promising non-von Neumann approach that aims to eliminate the data transfer to and from the memory subsystem. Although many architectures have been proposed, compiler support for such architectures is still lagging behind. In this paper, we close this gap by proposing an end-to-end compilation flow for in-memory computing based on the LLVM compiler infrastructure. Starting from sequential code, our approach automatically detects, optimizes, and offloads kernels suitable for in-memory acceleration. We demonstrate our compiler tool-flow on the PolyBench/C benchmark suite and evaluate the benefits of our proposed in-memory architecture, simulated in gem5, by comparing it with a state-of-the-art von Neumann architecture.
Comment: Full version of the DATE 2020 publication.
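One way to picture the per-kernel offloading decision such a flow has to make is the toy cost model below: offload when the host's data-movement cost outweighs the (typically slower) compute in memory. All constants and kernel figures are invented for illustration and are not taken from the paper or from its gem5 setup.

```python
# Toy offloading cost model (illustrative constants, hypothetical kernels).
from dataclasses import dataclass

@dataclass
class Kernel:
    name: str
    flops: float          # arithmetic work in the kernel
    bytes_moved: float    # data it streams through main memory

def should_offload(k, host_gflops=100.0, host_gbs=25.0, cim_gflops=20.0):
    # Host time: bounded by either compute or off-chip memory traffic.
    host_time = max(k.flops / (host_gflops * 1e9), k.bytes_moved / (host_gbs * 1e9))
    # In-memory time: slower compute, but no off-chip traffic to pay for.
    cim_time = k.flops / (cim_gflops * 1e9)
    return cim_time < host_time

for k in (Kernel("gemm", 2e9, 5e7), Kernel("stream-copy", 1e6, 2e9)):
    print(k.name, "-> offload" if should_offload(k) else "-> keep on the CPU")
```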
Prebypass: Software Register File Bypassing for Reduced Interconnection Architectures
Exposed Datapath Architectures (EDPAs) with aggressively pruned datapath connectivity, where not all function units in the design have connections to a centralized register file, are promising solutions for energy-efficient computation. Bypassing data directly between function units, without temporary copies to the register file, is a prime optimization for programming such architectures. However, traditional compiler frameworks such as LLVM assume that function units connect to register files and allocate all live variables in register files. This leads to schedule inefficiencies in terms of instruction-level parallelism and register accesses on EDPAs. To address these inefficiencies, we propose Prebypass, a new optimization pass for EDPA compiler backends. Experimental results on an EDPA class of architecture, the Transport-Triggered Architecture, show that Prebypass improves runtime, register reads, and register writes by up to 16%, 26%, and 37%, respectively, when the datapath is extremely pruned. Evaluation in a 28-nm FDSOI technology reveals that Prebypass improves core-level energy by 17.5% over the current heuristic scheduler.
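The toy model below captures the core bypassing idea: when a value's producer is scheduled close enough to its consumer, the consumer can read it directly from the producing function unit instead of the register file, eliminating a register read (removing the corresponding write when no other reader remains is a further step not shown). The instruction representation, the one-cycle bypass window, and all names are assumptions for this sketch, not Prebypass's actual implementation.

```python
# Sketch of software register-file bypassing over a linear schedule (assumed IR).
from collections import namedtuple

Op = namedtuple("Op", "cycle fu dst srcs")          # dst/srcs are register names

def prebypass(schedule, bypass_window=1):
    last_def = {}                                   # register -> (cycle, producing FU)
    rewritten, rf_reads_removed = [], 0
    for op in sorted(schedule, key=lambda o: o.cycle):
        new_srcs = []
        for src in op.srcs:
            if src in last_def and op.cycle - last_def[src][0] <= bypass_window:
                new_srcs.append(f"bypass:{last_def[src][1]}")  # read FU output directly
                rf_reads_removed += 1
            else:
                new_srcs.append(src)                           # read the register file
        if op.dst:
            last_def[op.dst] = (op.cycle, op.fu)
        rewritten.append(op._replace(srcs=tuple(new_srcs)))
    return rewritten, rf_reads_removed

sched = [Op(0, "ALU0", "r1", ("r2", "r3")),
         Op(1, "ALU1", "r4", ("r1", "r5")),   # r1 produced one cycle earlier -> bypass
         Op(5, "ALU0", "r6", ("r1", "r4"))]   # producers too far away -> keep RF reads
new_sched, removed = prebypass(sched)
print(removed, "register read(s) turned into bypasses")
```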
Exploring processor parallelism: Estimation methods and optimization strategies
Automatic optimization of application-specific instruction-set processor (ASIP) architectures mostly focuses on the internal memory hierarchy design or on the extension of reduced instruction-set architectures with complex custom operations. This paper focuses on very long instruction word (VLIW) architectures and, more specifically, on automating the selection of an application-specific VLIW issue-width. The issue-width selection strongly influences all the important processor properties (e.g., processing speed, silicon area, and power consumption). Therefore, accurate and efficient issue-width estimation and optimization are among the most important aspects of VLIW ASIP design. In this paper, we first compare different methods for estimating the required issue-width, and subsequently introduce a new force-based parallelism estimation method which is capable of estimating the required issue-width with only 3% error on average. Furthermore, we present and compare two techniques for estimating the required issue-width of software-pipelined loop kernels and show that a simple utilization-based measure provides an error margin of less than 1% on average.
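A minimal utilization-style estimate in this spirit (not the force-based method of the paper) is sketched below: for a loop body given as a data-dependence DAG, the longest dependence chain bounds the schedule length, and the operation count divided by that length bounds the issue-width needed to reach it. The graph encoding is an assumption for the example.

```python
# Utilization-based issue-width lower bound from a data-dependence DAG (sketch).
import math

def required_issue_width(num_ops, deps):
    """deps: (producer, consumer) edges over operation ids 0..num_ops-1 (a DAG)."""
    succs = {i: [] for i in range(num_ops)}
    for p, c in deps:
        succs[p].append(c)

    depth = {}
    def chain_length(i):                    # length of the longest chain starting at i
        if i not in depth:
            depth[i] = 1 + max((chain_length(s) for s in succs[i]), default=0)
        return depth[i]

    critical_path = max(chain_length(i) for i in range(num_ops))
    return math.ceil(num_ops / critical_path)

# Eight independent two-operation chains: 16 ops, critical path of 2,
# so roughly 8 issue slots are needed to reach the minimum schedule length.
edges = [(i, i + 8) for i in range(8)]
print(required_issue_width(16, edges))      # 8
```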
The conventional approach of moving data to the CPU for computation has become a significant performance bottleneck for emerging scale-out data-intensive applications due to their limited data reuse. At the same time, the advancement of 3D integration technologies has made the decade-old concept of coupling compute units close to the memory, called near-memory computing (NMC), more viable. Processing right at the “home” of the data can significantly diminish the data-movement problem of data-intensive applications. In this paper, we survey the prior art on NMC across various dimensions (architecture, applications, tools, etc.) and identify the key challenges and open issues, along with future research directions. We also provide a glimpse of our approach to near-memory computing, which includes i) NMC-specific, microarchitecture-independent application characterization, ii) a compiler framework to offload NMC kernels onto our target NMC platform, and iii) an analytical model to evaluate the potential of NMC.
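As an example of the kind of first-order reasoning such an analytical model involves, the Amdahl-style sketch below projects NMC speedup from the memory-bound fraction of host run time and the internal-to-external bandwidth ratio. The formula and parameter values are illustrative assumptions, not the model from the paper.

```python
# First-order NMC speedup projection (illustrative, Amdahl-style).
def nmc_speedup(mem_fraction, bandwidth_gain):
    """mem_fraction: share of host run time stalled on memory (0..1).
    bandwidth_gain: internal-to-external memory bandwidth ratio of the NMC system."""
    return 1.0 / ((1.0 - mem_fraction) + mem_fraction / bandwidth_gain)

for m in (0.2, 0.5, 0.8):
    print(f"memory-bound fraction {m:.0%}: "
          f"projected NMC speedup {nmc_speedup(m, bandwidth_gain=8):.2f}x")
```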